gh-140149: use PyBytesWriter in action_helpers.c's _build_concatenated_bytes; 3x faster bytes concat in the parser #140150
base: main
Conversation
Parser/action_helpers.c (outdated)

-    PyBytes_Concat(&res, elem->v.Constant.value);
+    Py_ssize_t part = PyBytes_GET_SIZE(elem->v.Constant.value);
+    if (part > 0) {
+        memcpy(out, PyBytes_AS_STRING(elem->v.Constant.value), part);
Why not use PyBytesWriter_WriteBytes() here?
PyBytesWriter_WriteBytes() grows the buffer if needed. That's not needed here, since the code already computes the total size in advance.
Parser/action_helpers.c
Outdated
PyBytes_Concat(&res, elem->v.Constant.value); | ||
Py_ssize_t part = PyBytes_GET_SIZE(elem->v.Constant.value); | ||
if (part > 0) { | ||
memcpy(out, PyBytes_AS_STRING(elem->v.Constant.value), part); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PyBytesWriter_WriteBytes()
grows the buffer if needed. It's not needed since the code already computes the total size in advance.
This optimization is good to have, but I don't think users will notice, since the parser is only run once at Python startup. I don't think it's worth documenting this optimization.

I concur, but on the other hand it doesn't hurt.
Co-authored-by: Victor Stinner <[email protected]>
LGTM
Misc/NEWS.d/next/Core_and_Builtins/2025-10-15-17-12-32.gh-issue-140149.cy1m3d.rst
Parser/action_helpers.c

-    PyObject* kind = asdl_seq_GET(strings, 0)->v.Constant.kind;
+    PyObject *kind = asdl_seq_GET(strings, 0)->v.Constant.kind;

     Py_ssize_t total = 0;
A bit of a meta question: how much does performance change if the length is not pre-calculated? The precalculation + memcpy makes this code quite a bit more complex. If it's only a couple of percentage points' difference (so most of the 3x is kept), the simpler code would be nice for some of these.
The issue gh-140149 provides more details.
This effectively makes bytes concatenation about 3x faster in the parser, for implicitly concatenated bytes literals (e.g. b"a" b"b" b"c").

Benchmark
The script:

The results (with --rigorous, on commit 9955759):

The environment: sudo ./python -m pyperf system tune was applied.